Port securesystemslib.hash module#2815
Conversation
369642b to
7ac9f23
Compare
tuf/api/_payload.py
Outdated
| T = TypeVar("T", "Root", "Timestamp", "Snapshot", "Targets") | ||
|
|
||
|
|
||
| def _hash(algo: str) -> Any: # noqa: ANN401 |
There was a problem hiding this comment.
Python's hashlib does not make the Hash type public, hence I'm using Any and ask the linter for forgiveness.
tests/repository_simulator.py
Outdated
|
|
||
| SPEC_VER = ".".join(SPECIFICATION_VERSION) | ||
|
|
||
| _DEFAULT_HASH_ALGORITHM = "sha256" |
There was a problem hiding this comment.
IMO it's okay to make this a local default, and not re-use the one newly added to _payload.py, so that we can keep the latter internal.
There was a problem hiding this comment.
sure, I'd be fine with just a magic value as well: it's not really even a default, it's just what this repository decided to use
securesystemslib.hash is a small wrapper around hashlib, which serves two main purposes: * provide helper function to hash a file * translate custom hash algorithm name "blake2b-256" to "blake2b" with (digest_size=32). In preparation for the removal of securesystemslib.hash, this patch ports above behavior to tuf and uses the builtin hashlib directly where possible. related secure-systems-lab/securesystemslib#943 Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
7ac9f23 to
866409f
Compare
tuf/api/_payload.py
Outdated
| """Returns hashed file.""" | ||
| f.seek(0) | ||
| if sys.version_info >= (3, 11): | ||
| digest = hashlib.file_digest(f, lambda: _hash(algo)) # type: ignore[arg-type] |
There was a problem hiding this comment.
Mypy does not allow IO[bytes] for f here. This may be related to python/typing#659. Ignore seems fair.
jku
left a comment
There was a problem hiding this comment.
looks fine to me. Left some comments
tuf/api/_payload.py
Outdated
| digest_object = _hash(algorithm) | ||
| digest_object.update(data) | ||
| else: | ||
| digest_object = sslib_hash.digest_fileobject( | ||
| data, algorithm | ||
| ) | ||
| except ( | ||
| sslib_exceptions.UnsupportedAlgorithmError, | ||
| sslib_exceptions.FormatError, | ||
| ) as e: | ||
| digest_object = _file_hash(data, algorithm) | ||
| except (ValueError, TypeError) as e: |
There was a problem hiding this comment.
the different functionality is a little annoying here (you need to call update on one digest_object but not the other)...
- If the
_hashand_file_hashfunctions directly returned the hexdigest this code would be simpler and the functions would be properly annotated - on the other hand then _hash() would need a data argument (meaning you would have to split the lambda needed by hashlib.file_digest() into another function)
I don't really know what is the simplest... up to you
There was a problem hiding this comment.
Does 535a189 look better? I don't have very strong feelings about either way.
There was a problem hiding this comment.
No passionate argument here either. I think I like this new version better.
tests/repository_simulator.py
Outdated
|
|
||
| SPEC_VER = ".".join(SPECIFICATION_VERSION) | ||
|
|
||
| _DEFAULT_HASH_ALGORITHM = "sha256" |
There was a problem hiding this comment.
sure, I'd be fine with just a magic value as well: it's not really even a default, it's just what this repository decided to use
coveralls.io is down again so I'm not sure... but I'm guessing this is the "blake2b-256" special case -- otherwise we're probably exercising this code. |
Remove the "default" prefix, because it's not a default but rather a fixed value. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Consolidate interface of bytes hash and file hash helpers. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Yep, blake is the culprit. Let me add a test. |
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
|
@jku, want to re-approve after my latest changes? |
jku
left a comment
There was a problem hiding this comment.
Looks good. The only nit I have left is that multiple docstrings still refer to "securesystemslib default hash algorithm".
Default hash sha256 is now defined locally. Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Good eyes! I pushed a fix and will merge when ci returns. |
fixes secure-systems-lab#943 * Internal use does not need the additional features (custom blake algorithm name support and file hashing), and was replaced by direct calls to hashlib. * External users were updated to no longer require `securesystemslib.hash` (theupdateframework/python-tuf#2815, in-toto/in-toto#861) Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
fixes secure-systems-lab#943 * Internally, the features added by the module (i.e. custom blake name support, and file hashing) are not needed. Builtin hashlib is used now directly. * External users port the relevant features into their code base: theupdateframework/python-tuf#2815 in-toto/in-toto#861 Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
securesystemslib.hash is a small wrapper around hashlib, which serves two main purposes:
In preparation for the removal of securesystemslib.hash, this patch ports above behavior to tuf and uses the builtin hashlib directly where possible.
related secure-systems-lab/securesystemslib#943